Abstract

We give an efficient algorithm for partitioning the domain of a numeric function f into
segments. The function f is realized as a polynomial in each segment, and a lookup table
stores the coefficients of the polynomial. Such an algorithm is an essential part of the
design of lookup table methods [5,8,9,12,14,15] for realizing numeric functions, such as
$\sin(\pi x)$, $\ln(x)$, and $\sqrt{-\ln(x)}$. Our algorithm requires many fewer steps than a previous
algorithm  given  in  [6]  and  makes  tractable  the  design  of  numeric  function  generators  based 
on  table  lookup  for  high-accuracy  applications.  We  show  that  an  estimate  of  segment 
width  based  on  local  derivatives  greatly  reduces  the  search  needed  to  determine  the  exact 
segment  width.  We  apply  the  new  algorithm  to  a  suite  of  15  numeric  functions  and 
show  that  the  estimates  are  sufficiently  accurate  to  produce  a  minimum  or  near-minimum 
number  of  computational  steps. 

Keywords:  piecewise  linear  approximation;  numeric  function  generators;  segmentation 
algorithm 

1.  Introduction 

The  existence  of  large  logic  circuits  has  led  to  increased  interest  in  an  old  problem  - 
the  realization  of  numeric  functions.  More  than  150  years  ago,  Babbage  designed  the 
difference  engine  to  automatically  compute  logarithmic  and  trigonometric  functions  [1]. 
This  was  intended  to  replace  hand  computation  which  was  prone  to  error. 

The availability of circuits that quickly compute functions like sin(x) and log(x) enables
real-time execution of algorithms used in applications such as the rendering
of graphics or digital signal processing.

Figure 1. Architecture of a numerical function generator using a piecewise polynomial approximation.

In this paper we give a new, efficient algorithm for partitioning the domain of a numeric
function f into segments. Within each segment, the function f is realized as a polynomial
with a lookup table storing the coefficients of the polynomial. We use an estimate
of segment width based on local derivatives to greatly reduce the search needed to determine
the exact optimal segment width. We then apply our new algorithm to a suite of
15 numeric functions, showing that the estimates are sufficiently accurate to produce a
minimum or near-minimum number of steps in the computation.

We  note  that  lookup  tables  have  been  used  previously  to  implement  a  truncated  series 
expression  approximation  of  the  given  function.  Hassler  and  Takagi  [7]  represent  the 
function  by  a  converging  series  and  replace  the  single  large  memory  by  two  or  more 
smaller  lookup  tables.  Stine  and  Schulte  [16,17]  use  the  Taylor  series  expansion  of  a 
differentiable  function.  The  first  two  terms  of  the  expansion  are  realized  by  using  small 
lookup  tables.  The  reciprocal,  square  root,  inverse  square  root,  and  certain  elementary 
functions  were  realized  by  Ercegovac,  Lang,  Muller,  and  Tisserand  [5]  using  a  Taylor 
expansion  and  tables.  Lookup  tables  have  been  used  in  the  implementation  of  logarithm 
and  antilogarithm  computations  by  Paul,  Jayakumar,  and  Khatri  [14]. 

Lee,  Luk,  Villasenor,  and  Cheung  [8,9]  realize  trigonometric  and  logarithmic  functions 
by  table  lookup  using  a  non-uniform  segmentation  method.  In  their  algorithm,  narrow 
segments  are  used  where  the  change  in  the  function  is  large,  and  wide  segments  are  used 
where  the  change  in  the  function  is  small. 

Sasao,  Butler,  and  Riedel  [15]  use  the  Douglas-Peucker  [4]  algorithm  to  partition  a 
given  function  into  segments  that  are  realized  by  a  linear  approximation.  They  show 
that a circuit producing a non-uniform segmentation has a tractable realization for common
numeric functions. Unfortunately, the Douglas-Peucker algorithm does not produce
optimum  segmentations  [6]. 

Fig.  1  shows  the  architecture  of  the  numeric  function  generator  (NFG)  that  realizes  a 
given  function  as  a  piecewise  polynomial  approximation  [11].  It  consists  of  three  blocks. 
The  Segment  Number  Generator  uses  the  value  of  x  to  generate  a  segment  number  that 
is applied to the address input of the Coefficients Memory. The Coefficients Memory produces
the coefficients in the polynomial expression for the given function. The piecewise
polynomial approximation $f(x) \approx c_m x^m + \cdots + c_1 x + c_0$ is computed by the Polynomial
Circuit in Fig. 1 using the coefficients produced by the Coefficients Memory.
Each segment in the function domain corresponds to a word in the memory which stores
the  polynomial  coefficients  for  the  function  approximation  in  that  segment.  For  a  given 
approximation  error,  we  seek  a  segmentation  of  the  domain  that  has  the  fewest  segments 
possible.  This  minimizes  the  memory  required  for  the  lookup  table. 

The algorithm given in this paper efficiently divides the domain of f into segments
so that the error in polynomial approximation in each segment is no greater than
a  specified  error.  In  [6],  we  presented  an  algorithm  that  produced  a  segmentation  with 
the  fewest  segments.  However,  this  algorithm  can  be  computationally  intensive,  with  a 
computation  time  sometimes  measured  in  days  or  weeks.  Although  applied  only  once 
in  the  synthesis  of  a  numeric  function  generator,  the  previous  algorithm  can  make  high 
accuracy  applications  impractical.  Our  main  result,  the  new  algorithm  presented  here,  is 
orders  of  magnitude  faster  and  still  yields  the  fewest  steps. 

While the proposed segmentation algorithm applies to an approximating polynomial of any order,
our experimental results focus on linear and quadratic approximations. Nagayama,
Sasao, and Butler [12] show that presently available field programmable gate arrays (FPGAs)
have insufficient arithmetic elements, such as multipliers, to efficiently implement
third or higher order polynomials. As FPGA technology improves, this may change.

2.  Background 

Because the variable x and the function's value f(x) are represented as binary numbers
with a fixed number of bits, a numeric function generator's output is inherently an
approximation of the exact function value. While we may view the value of x as exact, it
may not be possible to view f(x) as exact. For example, consider the function $f(x) = \sqrt{x}$.
If x = 2, we can realize 2 exactly. However, the irrationality of $\sqrt{2}$ means that its exact
value cannot be realized in finitely many bits.

3.  Estimating  the  Segment  Width 

An  essential  part  of  the  new  segmentation  algorithm  is  deriving  an  estimate  of  the 
segment  width.  An  accurate  estimate  is  essential,  because  subsequently  a  search  must 
be  performed  for  the  exact  segment  width.  Later,  we  analyze  the  estimate’s  accuracy 
and  show  that,  in  many  cases,  it  is  as  accurate  as  it  can  possibly  be.  First,  we  focus  on 
deriving  the  estimate. 

Let the segment over which we seek an nth-order polynomial approximation span [s, e].
The maximum approximation error $\varepsilon$ of a Chebyshev approximation [10] is

$$\varepsilon = \frac{2(e-s)^{n+1}}{4^{n+1}(n+1)!} \max_{s \le x \le e} \left| f^{(n+1)}(x) \right|. \tag{1}$$

Solving (1) for the segment width, $e - s$, yields

$$e - s = 4 \left( \frac{(n+1)!\,\varepsilon}{2 \max_{s \le x \le e} \left| f^{(n+1)}(x) \right|} \right)^{\frac{1}{n+1}}. \tag{2}$$

For the two special cases of linear and quadratic approximating polynomials, we have

$$e - s\,|_{\mathrm{linear}} = 4 \left( \frac{\varepsilon}{|f''_{e-s}(x^*)|} \right)^{\frac{1}{2}} \tag{3}$$

and

$$e - s\,|_{\mathrm{quadratic}} = 4 \left( \frac{3\varepsilon}{|f'''_{e-s}(x^*)|} \right)^{\frac{1}{3}}, \tag{4}$$

where $e - s\,|_{\mathrm{linear}}$ and $e - s\,|_{\mathrm{quadratic}}$ are the segment widths for linear and quadratic
approximations, respectively. We have chosen to replace $\max_{s \le x \le e} |f''(x)|$ and $\max_{s \le x \le e} |f'''(x)|$
by the abbreviations $|f''_{e-s}(x^*)|$ and $|f'''_{e-s}(x^*)|$, respectively, recognizing that if the
appropriate derivative is continuous on a closed interval, then the maxima above will each be
attained at some point $x^*$ within that interval.
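The width estimate can be sketched in a few lines of Python. This is a minimal illustration, assuming the error bound has the form $\varepsilon = 2(e-s)^{n+1} \max|f^{(n+1)}| / (4^{n+1}(n+1)!)$; the function name `estimated_width` and the example values are hypothetical and not taken from the paper's MATLAB implementation.

```python
import math

def estimated_width(eps, n, deriv_max):
    """Segment width e - s obtained by solving the Chebyshev error bound for the width.

    eps       : allowed approximation error
    n         : order of the approximating polynomial
    deriv_max : max |f^(n+1)(x)| on the segment (assumed to be known or sampled)
    """
    if deriv_max == 0:
        return math.inf  # f behaves like a polynomial of degree <= n here
    return 4.0 * (math.factorial(n + 1) * eps / (2.0 * deriv_max)) ** (1.0 / (n + 1))

# Example (illustrative values): f(x) = 2**x near x = 0, where f''(0) = ln(2)**2.
eps = 2.0 ** -17
w_linear = estimated_width(eps, 1, math.log(2) ** 2)      # linear special case
w_quadratic = estimated_width(eps, 2, math.log(2) ** 3)   # quadratic special case
```

For n = 1 the expression reduces to 4·sqrt(ε/|f''|), and for n = 2 to 4·(3ε/|f'''|)^(1/3), which are the two special cases stated in the text.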


4.  The  Segmentation  Algorithm 
4.1.  Introduction 

The  algorithm  is  shown  in  Table  1.  It  applies  to  polynomial  approximations  of  any 
order.  We  assume  that  the  function  domain  is  represented  by  a  vector  of  N  discrete  points. 
For example, if the interval is [0, 1) and the accuracy is 8 bits, then we may choose N to
be 256, and the points, in binary, to be $0.0000\,0000_2$, $0.0000\,0001_2$, $0.0000\,0010_2$, ..., and
$0.1111\,1111_2$. That is, the domain in this example is the vector [0.0000, 0.0039, 0.0078, ...,
0.9961]. This assumption is consistent with the algorithm's implementation in MATLAB
[3].  In  MATLAB,  we  associate  this  vector  with  variable  x.  f(x),  in  MATLAB,  is  then  a 
vector  of  elements  corresponding  to  the  function  evaluated  at  each  of  the  elements  in  x. 
Therefore,  f(x)  also  has  N  elements. 
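
As a rough illustration, the same discretization can be written in Python with NumPy. The paper's implementation is in MATLAB, so this is only a translation under that assumption, and the choice of f(x) = 2^x is an arbitrary example.

```python
import numpy as np

bits = 8                      # accuracy in bits (example value from the text)
N = 2 ** bits                 # number of discrete points: 256
x = np.arange(N) / N          # 0, 1/256, 2/256, ..., 255/256 on [0, 1)

f = lambda t: 2.0 ** t        # example function, chosen only for illustration
fx = f(x)                     # the function evaluated at every element of x
```

Both `x` and `fx` have N elements, matching the description above.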

Definition  1.  For  a  given  function,  a  step  in  a  segmentation  algorithm  is  a  computation 
of  the  maximum  absolute  error  between  the  function  and  its  approximating  polynomial  on 
the  proposed  segment. 

Because  so  much  computation  time  occurs  in  the  calculation  of  the  maximum  absolute 
error,  a  step,  as  defined  in  Definition  1,  is  an  appropriate  measure  of  the  execution  time. 
We  compare  the  number  of  steps  needed  in  the  proposed  algorithm  with  the  number  of 
steps  needed  in  a  brute  force  method. 
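
One step in the sense of Definition 1 might look like the following Python sketch. It assumes the approximating polynomial is obtained by interpolation at Chebyshev points using NumPy's `Chebyshev.interpolate`; the paper does not state how its MATLAB code fits the polynomial, so that choice and the helper name `segment_error` are assumptions.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def segment_error(f, x, i, j, n):
    """One 'step': max |f - p| over the points x[i..j] for a degree-n polynomial p."""
    if j <= i:
        return 0.0  # a single point is represented exactly
    # Interpolate f at Chebyshev points of the segment [x[i], x[j]].
    p = Chebyshev.interpolate(f, n, domain=[x[i], x[j]])
    seg = x[i:j + 1]
    return float(np.max(np.abs(f(seg) - p(seg))))

x = np.arange(256) / 256.0
err = segment_error(np.exp, x, 0, 32, 1)   # linear fit of e^x on [0, 0.125]
```

Each call evaluates the function and the polynomial at every point in the proposed segment, which is why counting calls is a reasonable proxy for execution time.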

In the brute force segmentation, the beginning point of the first segment is chosen
to be the leftmost point in the interval of approximation, i.e., $x_{\mathrm{low}}$. Then, the second
point and successive points are chosen as prospective end points, and, for each choice,
the error between the function and its approximating polynomial is computed. When
this error exceeds the approximation error $\varepsilon$, the exact segment width has been found;
it corresponds to the end point just before the end point that resulted in an error that
exceeded $\varepsilon$. In the brute force method, every point in the interval except the leftmost is a
prospective end point at which the error between the function and its approximating
polynomial is computed. Therefore, approximately N steps are needed, where N is the
number of points used to represent the function in the whole interval of approximation.

Table 1

Algorithm to segment a given function based on estimates of the segment length

Algorithm 1: Segment the domain $[x_{\mathrm{low}}, x_{\mathrm{high}}]$ of a given function f(x), where
f(x) is approximated in each segment by a polynomial $c_n x^n + \cdots + c_1 x + c_0$.

Input: Function f(x), domain $[x_{\mathrm{low}}, x_{\mathrm{high}}]$, approximation error $\varepsilon$, and order
of the approximating polynomial, n.

Output: Optimum segmentation, in which the i-th segment is specified as
$[s_i, e_i]$, where $s_i$ and $e_i$ are the beginning and end point, respectively.

1. $i \leftarrow 1$. $s_i \leftarrow x_{\mathrm{low}}$.

ESTIMATE

2. Estimate the current segment width, determining end point $e_{\mathrm{est}}$ and approximation
error $\varepsilon_{\mathrm{est}}$. If $e_{\mathrm{est}} > x_{\mathrm{high}}$, then $e_{\mathrm{est}} \leftarrow x_{\mathrm{high}}$. If $e_{\mathrm{est}} = x_{\mathrm{high}}$ and $\varepsilon_{\mathrm{est}} \le \varepsilon$,
then STOP with $e_i \leftarrow e_{\mathrm{est}}$.

LOCATE

4. If $\varepsilon_{\mathrm{est}} \le \varepsilon$, then find upper and lower bounds H and L on the segment end
point with the property

   a) $\varepsilon_L \le \varepsilon < \varepsilon_H$, where $\varepsilon_H$ and $\varepsilon_L$ are the approximation errors for the
      segments $[s_i, H]$ and $[s_i, L]$, respectively. Go to Step 5.

   b) $\varepsilon_H \le \varepsilon$ and $H = x_{\mathrm{high}}$. STOP with $e_i \leftarrow e_{\mathrm{est}}$.

   If $\varepsilon_{\mathrm{est}} > \varepsilon$, then find upper and lower bounds H and L on the segment end
   point.

PINPOINT

5. Using H and L, produce $H_{pp}$ and $L_{pp}$ with the property $\varepsilon_{L_{pp}} \le \varepsilon < \varepsilon_{H_{pp}}$,
where $\varepsilon_{H_{pp}}$ and $\varepsilon_{L_{pp}}$ are the approximation errors for the segments $[s_i, H_{pp}]$
and $[s_i, L_{pp}]$, respectively, that are adjacent points above and below the optimum
segment width. Choose the segment end point $e_i$ to be $L_{pp}$.

6. $s_{i+1} \leftarrow$ point above $e_i$. $i \leftarrow i + 1$. Go to Step 2.
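
The brute force method described above can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' code; the Chebyshev-point interpolation inside `segment_error` and all function names are assumptions.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def segment_error(f, x, i, j, n):
    """Max |f - p| over x[i..j]; p is a degree-n Chebyshev-point interpolant (assumed)."""
    if j <= i:
        return 0.0
    p = Chebyshev.interpolate(f, n, domain=[x[i], x[j]])
    seg = x[i:j + 1]
    return float(np.max(np.abs(f(seg) - p(seg))))

def brute_force_segments(f, x, eps, n=1):
    """Return (segments, steps); each segment is a (start_index, end_index) pair."""
    segments, steps, i, N = [], 0, 0, len(x)
    while i < N:
        j = i
        # Try each successive point as the end point until the error exceeds eps.
        while j + 1 < N:
            steps += 1
            if segment_error(f, x, i, j + 1, n) > eps:
                break
            j += 1
        segments.append((i, j))
        i = j + 1   # the next segment starts at the point after the end point
    return segments, steps

x = np.arange(256) / 256.0
segs, steps = brute_force_segments(np.exp, x, 2.0 ** -9)
```

In this sketch every point except the leftmost is tested exactly once as a prospective end point, so the step count is N - 1, in line with the "approximately N steps" stated above.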

The algorithm proceeds from the smallest value in the domain, $x_{\mathrm{low}}$, to the largest, $x_{\mathrm{high}}$.
It establishes the largest segment, starting at $x_{\mathrm{low}}$, such that the maximum approximation
error does not exceed $\varepsilon$. It repeats this process starting at the point just after $e_1$, the end point
of the first segment, until it reaches $x_{\mathrm{high}}$. Often, the last segment is truncated because $x_{\mathrm{high}}$
is reached before a segment end occurs (where the approximation error is $\varepsilon$). As a result, it is
not unusual for the last segment to have a maximum approximation error strictly less than $\varepsilon$.

Fig.  2  shows  an  example  segmentation.  The  vertical  axis  plots  the  function  value  f(x), 
while the horizontal axis plots x. Here, $x_{\mathrm{low}}$ is the left-hand end of the interval over which f(x)
is realized, and $x_{\mathrm{high}}$ is the right-hand end of the interval.

4.2.  Three  Parts  to  the  Algorithm 

There  are  three  parts  to  the  algorithm. 

In the first part, ESTIMATE, the segment width is estimated. The process of estimation
is discussed in the next section. Using the estimated segment width, an end point is found
and the approximation error for the proposed segment is computed. This counts as one
step.

J. Butler, C. Frenzen, N. Macaria, and T. Sasao

Figure 2. Example segmentation.

In the second part, LOCATE, two points in the domain are located such that one point
yields a segment whose approximation error is just below (or equal to) $\varepsilon$, and the other
point yields a segment whose approximation error is just above.

This is accomplished as follows: from the estimated segment width computed in ESTIMATE,
it is known whether the corresponding point is above the optimum segment
width or below (or equal). If above, LOCATE proceeds towards lower values of x, searching
for two points that straddle the optimum segment width. If below, LOCATE proceeds
towards higher values. Assume the point is below. The algorithm proceeds toward the
optimum segment width by one point initially. It computes the approximation error of
the new segment, adding 1 to the number of steps. If the approximation error exceeds $\varepsilon$,
LOCATE stops. It has found two points, one on each side of the optimum segment width.
Indeed, they are adjacent and the algorithm stops; there is no need to proceed to the next
step, PINPOINT.

However, if the approximation error is still less than (or equal to) $\varepsilon$, the algorithm
advances two points. Again, it computes the approximation error, adding 1 to the number
of steps, and repeats the process above. This is repeated with the algorithm advancing four,
eight, etc. points, until two points are found that are on each side of the optimum segment
width. If a total of m steps are taken, the algorithm has advanced $1 + 2 + 4 + \cdots + 2^{m-1} = 2^m - 1$
points. At the end of LOCATE, the last two points considered, H and L,
correspond to end points of segments that straddle the exact end point of the segment.
Specifically, H is the end point of a segment in which the error achieved is greater
than $\varepsilon$, and L is the end point of a segment in which the error achieved is less than or
equal to $\varepsilon$.

In the third part, PINPOINT, a bisection method is applied to H and L. That is, the
midpoint $A = \frac{H+L}{2}$ is computed. Then, a new segment whose end point is A is created,
and its approximation error is computed. If this exceeds $\varepsilon$, then H is replaced by A and
the process is repeated. If this is less than or equal to $\varepsilon$, then L is replaced by A and the
process is repeated. Each time a new approximation error is computed, the number of
steps is increased by 1. The process stops when H and L are adjacent. The segment
end point is chosen to be L, since the maximum error in the segment ending in L is less
than or equal to $\varepsilon$, while the maximum error in the segment ending in H exceeds $\varepsilon$.
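
The three parts can be combined into a short Python sketch for the linear case. It is a simplified reading of the description above, not the authors' MATLAB program: it works on point indices, assumes the error grows monotonically with the segment end point, uses Chebyshev-point interpolation for the fit, and takes one derivative estimate at the start of each segment. All names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import Chebyshev

def segment_error(f, x, i, j, n):
    """Max |f - p| over x[i..j]; p is a degree-n Chebyshev-point interpolant (assumed)."""
    if j <= i:
        return 0.0
    p = Chebyshev.interpolate(f, n, domain=[x[i], x[j]])
    seg = x[i:j + 1]
    return float(np.max(np.abs(f(seg) - p(seg))))

def find_end(f, x, i, guess, eps, n):
    """Return (end_index, steps) for the segment that starts at index i."""
    N, steps = len(x), 0

    def ok(j):                      # one step: is the error of x[i..j] within eps?
        nonlocal steps
        steps += 1
        return segment_error(f, x, i, j, n) <= eps

    # ESTIMATE: test the guessed end point, clamped to the domain.
    j = min(max(guess, i + 1), N - 1)
    below = ok(j)
    if below and j == N - 1:
        return j, steps             # last segment, nothing more to do
    # LOCATE: move away from j by 1, 2, 4, ... points until L and H straddle the end.
    L, H = (j, None) if below else (None, j)
    stride = 1
    while L is None or H is None:
        j = min(max(j + stride if below else j - stride, i), N - 1)
        stride *= 2
        if j == i or ok(j):         # a single-point segment needs no evaluation
            if below and j == N - 1:
                return j, steps     # reached x_high without exceeding eps
            L = j
        else:
            H = j
    # PINPOINT: bisection until H and L are adjacent.
    while H - L > 1:
        mid = (L + H) // 2
        if ok(mid):
            L = mid
        else:
            H = mid
    return L, steps

def segment(f, d2f, x, eps):
    """Linear segmentation; d2f is the second derivative of f (supplied by the caller)."""
    i, total, out, dx = 0, 0, [], x[1] - x[0]
    while i < len(x):
        d = abs(d2f(x[i]))
        width = 4.0 * math.sqrt(eps / d) if d > 0 else x[-1] - x[i]
        end, steps = find_end(f, x, i, i + int(width / dx), eps, 1)
        out.append((i, end))
        total += steps
        i = end + 1
    return out, total

x = np.arange(4096) / 4096.0
segs, steps = segment(np.exp, np.exp, x, 2.0 ** -12)
```

For this example the total number of steps is a small fraction of the 4096 points that a brute force search would visit.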

4.3.  Number  of  Steps 

Because of the way Algorithm 1 is constructed, the difference between H and L is $2^m$.
We have the following lemma.

Lemma 1. Let the number of points between the high and low points, $H - L$, be a power
of 2, $2^m$. For all but the last segment, the average and the worst case number of steps
$N_{\mathrm{PINPOINT}}(m)$ required by PINPOINT is m. No steps are required by PINPOINT to
compute the last segment.


Proof. The proof is by induction on m. For m = 0, H and L are adjacent, and no further
steps are needed. Assume the hypothesis is true for all $m < m'$, and consider $H - L = 2^{m'}$.
There is one step required to compute the approximation error for a segment that ends
at $P = \frac{H+L}{2}$. Either H or L is replaced by P, and the problem is one of determining the
number of steps needed to compute the segment end point between (the new) H and L.
Since $H - L = 2^{m'-1}$, from the assumption, $m' - 1$ steps are needed, for a total of $m'$
steps.

No steps are required by PINPOINT to compute the last segment because H is $x_{\mathrm{high}}$
and the error associated with a segment end point of H is equal to or less than $\varepsilon$. ∎
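
Lemma 1 can be checked numerically with a small Python sketch of the bisection. The predicate `ok` stands in for the test "the error of the segment ending at this index is at most ε" and is a hypothetical placeholder, as is the function name.

```python
def pinpoint_steps(L, H, ok):
    """Bisect between L (error within bound) and H (error too large); count the steps."""
    steps = 0
    while H - L > 1:
        mid = (L + H) // 2
        steps += 1                 # one error evaluation per bisection
        if ok(mid):
            L = mid
        else:
            H = mid
    return L, steps

# With H - L = 2**m, the gap halves exactly at every step, so m steps are used
# regardless of where the true end point lies.
results = {}
for m in range(0, 11):
    true_end = (2 ** m) // 3       # arbitrary position of the exact end point
    results[m] = pinpoint_steps(0, 2 ** m, lambda j: j <= true_end)
```

Because the gap is a power of 2, the best, average, and worst cases coincide.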

Similarly,  the  number  of  steps  required  by  LOCATE  can  be  calculated,  as  shown  in  Lemma 
2.  We  begin  with  a  definition. 

Definition 2. A truncated segment is a segment whose estimated end point is greater
than $x_{\mathrm{high}}$.

For each segment, the algorithm in Table 1 provides an estimated segment end point that
is used to start the search for the exact end point. A truncated segment has the property
that its estimated end point is greater than $x_{\mathrm{high}}$. Often, a truncated segment occurs as
the last (rightmost) segment in a segmentation.

Interestingly,  a  truncated  segment  is  not  necessarily  the  last  segment.  For  example, 
suppose  that  in  a  linear  approximation  of  the  function,  the  function  is  nearly  linear 
throughout  most  of  the  interval,  except  near  the  end.  In  this  case,  the  segment’s  proposed 
end  point  may  reach  the  end  point  of  the  interval  of  approximation  (especially,  if  the 
segment  is  near  the  interval  end  point).  Thus,  it  may  be  a  truncated  segment.  However, 
when  PINPOINT  is  applied,  the  exact  end  point  may  be  found  to  be  an  internal  point. 
Since  the  next  segment  might  be  in  a  highly  non-linear  part  of  the  domain,  the  segment  is 
necessarily  narrow,  and  its  end  point  may  not  reach  the  interval’s  end  point.  Therefore, 
subsequently constructed segments may be non-truncated. In the course of generating
the experimental data, our proposed algorithm encountered this phenomenon.

Lemma 2. The number of steps required to construct a non-truncated segment in LOCATE,
$N_{\mathrm{LOCATE}}(m)$, is $N_{\mathrm{PINPOINT}}(m) + 2$.

Proof. The proof is by induction on m. For m = 0, H and L are adjacent, and PINPOINT
requires no steps. The approximation errors associated with segments whose end points
are H and L require a total of two steps. Therefore, $N_{\mathrm{PINPOINT}}(0) = 0$ and $N_{\mathrm{LOCATE}}(0) = 2$.
Assume the hypothesis is true for all $m < m'$, and consider $m'$. It follows that
$N_{\mathrm{LOCATE}}(m') = N_{\mathrm{LOCATE}}(m' - 1) + 1$. The hypothesis follows. ∎

For the last segment, LOCATE requires some number of steps before it is determined that
$H = x_{\mathrm{high}}$. At this point, if the approximation error with H as the segment end point is
equal to or less than $\varepsilon$, then it is established that, indeed, this is the last segment. No
steps are needed by PINPOINT.

In  the  best  case,  ESTIMATE  produces  a  segment  end  point  that  is  no  more  than  one 
step  away  from  the  optimum  segment  end  point.  This  requires  one  step.  To  verify  this 
and  thus  terminate  the  segment  construction,  another  step  is  required,  for  a  total  of  two 
steps  per  segment.  From  the  discussion  above,  a  truncated  segment  may,  in  the  best  case, 
require  only  one  step.  Therefore,  we  have 

Lemma 3. At least $2s - 1$ steps are needed to segment a domain, where s is the number
of segments in the segmentation.

Lemma  3  assumes  that  the  estimates  of  segment  length  are  as  accurate  as  possible. 
As  N,  the  number  of  points  in  the  domain,  becomes  large,  then  the  percentage  of  steps 
needed  compared  to  the  brute  force  method  approaches  0.  The  program  shows  a  clear 
tendency  to  lower  percentage  of  steps  as  N  increases. 
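
The lower bound of Lemma 3 can be turned into a percentage of the brute force cost with a short calculation. The sketch below is illustrative; it takes the brute force cost to be N steps, and the values 75 and 2^16 are the segment count and point count reported for 2^x in Table 3.

```python
def min_step_percentage(num_segments, num_points):
    """Lower bound 2s - 1 on the steps, as a percentage of roughly N brute force steps."""
    return 100.0 * (2 * num_segments - 1) / num_points

pct = min_step_percentage(75, 2 ** 16)   # about 0.23 percent
```

Since the bound depends only on the number of segments, the percentage shrinks as N grows for a fixed segmentation.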

5.  Artifacts  Associated  With  the  Use  of  Different  Accuracies 

5.1.  A  Conundrum 

Intuition suggests that using many points (e.g. 10,000,000) to represent an interval of
approximation $[x_{\mathrm{low}}, x_{\mathrm{high}}]$ yields a more accurate segmentation than when fewer points
are used (e.g. 256). Thus, one expects the segments to be narrower (or the same) when
fewer points are used. Otherwise, if the segments were wider, the approximation error
would be greater than $\varepsilon$. Therefore, one expects that more segments (or the same number)
are needed when there are fewer points to represent the interval of approximation.

However, this is not the case. Table 2 shows the number of segments needed to realize
three functions, $\sqrt{-\ln(x)}$, $-(x\log_2 x + (1-x)\log_2(1-x))$, and $\sin(e^x)$, using 8 bits
of precision and a linear approximation (footnote 2). There are two cases, N = 256 and N =
10,000,000. For all three functions, the number of segments for N = 10,000,000 is larger
than for N = 256.

5.2.  Resolution 

In the algorithm shown in Table 1, the beginning point of a segment is the next point
after the end point of the previous segment (not the same point at which the last segment
ends). This recognizes that each point belongs to exactly one segment. Thus, there is
"space" between segments which need not be realized by a polynomial. This is benign;
there are no input combinations that correspond to values in this space. For example, in
the case of $\sin(e^x)$, there are 27 segments, and thus, 26 spaces between segments. Since
only 256 points represent the interval, about 26/256 of the interval is not realized
in the approximation. This effectively shortens the interval by about 10%. In the case
of N = 10,000,000, the space between segments is a much smaller fraction of the total
interval width. This effect dominates and is the reason that fewer points yield fewer
segments in Table 2.

Footnote 2: $\sqrt{-\ln(x)}$, $-(x\log_2 x + (1-x)\log_2(1-x))$, and $\sin(e^x)$ were considered in [8], [6], and [11], respectively.

Table 2

Three functions which require fewer segments when N = 256 than when N = 10,000,000
for a linear piecewise approximation.

Function f(x)                          Interval x               No. of Segs.    No. of Segs.
                                                                N = 256         N = 10^7
$\sqrt{-\ln(x)}$                       (illegible in source)    12              14
$-(x\log_2 x + (1-x)\log_2(1-x))$      (illegible in source)    19              20
$\sin(e^x)$                            (illegible in source)    27              28
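
The size of this effect can be checked with a few lines of Python. The sketch assumes each space between segments is one point spacing wide, which is how the text counts it; the segment counts are those given for sin(e^x) in Table 2.

```python
def gap_fraction(num_segments, num_points):
    """Fraction of the interval lying in the spaces between adjacent segments."""
    return (num_segments - 1) / num_points

coarse = gap_fraction(27, 256)          # N = 256, 27 segments
fine = gap_fraction(28, 10_000_000)     # N = 10,000,000, 28 segments
```

With 256 points about a tenth of the interval is never approximated, while with 10,000,000 points the unrealized fraction is negligible.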

6.  Experimental  Results 
6.1.  Benefits  of  Estimates 

To analyze the benefit of estimates in the proposed algorithm, we configured a MATLAB
program to apply only LOCATE and PINPOINT in constructing each segment. That
is, estimates were not used in specifying a prospective end point of the next segment.
Instead, the initial end point was chosen to be just beyond the beginning point of the
newly constructed segment. In this case, LOCATE and PINPOINT must search over the
full segment. Table 3 shows how this compares to the brute force method when applied to
a suite of 15 functions for $\varepsilon = 2^{-17}$. Each entry represents the ratio of the number of steps
needed to compute the segmentation using the proposed algorithm to the number of steps
needed by the brute force method. This is expressed as a percentage. The values, shown
in the column labeled # of Estimates = 0, range from 0.7244% for $\frac{1}{1+e^{-x}}$ to 10.1860%
for $\sin(e^x)$. This shows that LOCATE and PINPOINT realize a significant reduction over
the brute force method. For 13 of the 15 functions, the ratios are less than 5.0%.

However, estimates provide still further improvement. Table 3 shows the benefits of 1, 2,
and 3 estimates. The column labeled # of Estimates = 1 shows that, when one estimate
is used, the number of steps is reduced to as little as about one-fifth of that needed with
no estimate. For example, in the case of the entropy function $-(x\log_2 x + (1-x)\log_2(1-x))$,
no estimate yields a percentage of 7.7408%, while one estimate achieves a percentage of
1.3785%, which is 1/5.6 of the number of steps.

In the case of one estimate, the beginning point of the segment is used to determine
an estimate for the segment width. For example, when linear approximation is used, the
second derivative at the new segment's beginning point is computed and substituted into
(3) to derive an estimate for the segment width. Then, a proposed end point is obtained
by adding the estimated segment width to the beginning point. The approximation error
is computed and used to determine in which direction from the estimated end point
LOCATE should search.

The next column, labeled # of Estimates = 2, shows the benefit of two estimates. In
this case, the estimate of the segment width computed with the first step in the segment
(discussed in the previous paragraph) is averaged with the estimate of the segment width
computed at the segment end point, as estimated from the first step. This approach
is based on the assumption that the average of two estimates, one at the beginning
and one near the end of the proposed segment, provides a better estimate of the actual
segment width than one estimate alone. As can be seen in Table 3, two estimates provide a
substantial reduction in the number of steps. Indeed, for 9 of the 15 functions, the
minimum number of steps is achieved (whereas the minimum was not achieved for any of
the 15 functions in the case of one estimate). An asterisk indicates that this percentage
is the best that can be obtained, as shown in Lemma 3. The reduction in the number of
steps achieved by using two estimates instead of one ranges from 1/1.4 to 1/4.4.

Table 3

Percentage of steps (compared to brute force) required to segment functions approximated
by linear polynomials using different estimates of segment width for $N = 2^{16}$ and $\varepsilon = 2^{-17}$.

                                                                 % of Steps vs. Brute Force
                                                                 (# of Estimates)
Function f(x)                          Interval x                0       1      2       3       Min. %   # of Segs
$2^x$                                  [0, 1)                    2.28    0.46   *0.23   *0.23   0.23     75
$1/x$                                  [1, 2)                    2.34    0.75   *0.23   *0.23   0.23     75
$\sqrt{x}$                             [1, 2)                    1.19    0.46   *0.11   *0.11   0.11     35
$1/\sqrt{x}$                           [1, 2)                    1.62    0.62   *0.15   *0.15   0.15     50
$\log_2(x)$                            [1, 2)                    2.35    0.67   *0.23   *0.23   0.23     76
$\ln x$                                [1, 2)                    2.00    0.60   *0.19   *0.19   0.19     63
$\sin(\pi x)$                          [0, 1/2)                  3.16    0.71   0.38    0.35    0.33     109
$\cos(\pi x)$                          [0, 1/2)                  3.15    0.70   0.35    *0.33   0.33     109
$\tan(\pi x)$                          [0, 1/4)                  2.25    0.83   0.27    0.25    0.22     73
$\sqrt{-\ln(x)}$                       (illegible in source)     4.87    1.36   *0.63   *0.63   0.63     207
$\tan^2(\pi x) + 1$                    [0, 1/4)                  4.25    0.82   *0.46   *0.46   0.46     152
$-(x\log_2 x + (1-x)\log_2(1-x))$      [1/256, 255/256]          7.74    1.38   *0.96   *0.96   0.96     314
$\frac{1}{1+e^{-x}}$                   [0, 1)                    0.72    0.37   0.14    0.10    0.06     20
$\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$     [0, $\sqrt{2}$]           2.32    0.84   0.38    0.30    0.23     53
$\sin(e^x)$                            [0, 2)                    10.19   2.05   1.43    1.40    1.35     449

* All segments require the fewest steps.

The  next  column  labeled  #  of  Estimates  =  3  shows  the  benefit  of  three  estimates. 
In  this  case,  the  final  estimate  is  the  average  of  three  estimates,  one  from  the  beginning, 
one  from  the  end,  and  one  from  the  middle  of  the  segment  whose  width  is  estimated  from 
the  first  point  in  the  segment.  Now,  10  of  the  15  functions  achieve  the  minimum  number 
of  steps. 
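
The one, two, and three estimate variants for the linear case can be sketched in Python as below. This is one reading of the description, assuming each estimate is $4\sqrt{\varepsilon/|f''|}$ evaluated at a single point; the function names and the example function are hypothetical.

```python
import math

def width_at(d2f, t, eps):
    """Linear-case width estimate using the second derivative at the single point t."""
    d = abs(d2f(t))
    return 4.0 * math.sqrt(eps / d) if d > 0 else math.inf

def averaged_width(d2f, s, eps, num_estimates):
    """Average of 1, 2, or 3 width estimates for a segment beginning at s."""
    w_begin = width_at(d2f, s, eps)                  # estimate from the beginning point
    if num_estimates == 1:
        return w_begin
    w_end = width_at(d2f, s + w_begin, eps)          # estimate from the estimated end point
    if num_estimates == 2:
        return (w_begin + w_end) / 2.0
    w_mid = width_at(d2f, s + w_begin / 2.0, eps)    # estimate from the estimated middle
    return (w_begin + w_mid + w_end) / 3.0

# Example: f(x) = 2**x, whose second derivative is ln(2)**2 * 2**x.
d2f = lambda t: math.log(2) ** 2 * 2.0 ** t
eps = 2.0 ** -17
w1, w2, w3 = (averaged_width(d2f, 0.0, eps, k) for k in (1, 2, 3))
```

For a function whose second derivative grows across the segment, the single estimate taken at the beginning point overshoots, and the averaged estimates pull the proposed end point back toward the exact one.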

The column labeled Min % shows a percentage that represents the minimum number
of steps required if the estimates were perfect, as specified by Lemma 3. Comparing this
with the column labeled # of Estimates = 3 shows that, even for the five functions that
did not achieve a minimum number of steps, the number of steps is close to minimum.
Four of the five functions are within 30%, while one, $\frac{1}{1+e^{-x}}$, is within 72%.

Table 4 shows the detail of a segmentation of the domain of two functions, $\sin(\pi x)$
and $\cos(\pi x)$, using 8 bit precision with a given approximation error of $2^{-9}$. Specifically,
this shows the number of steps in each of the seven segments required to compute the
segment width. It also shows the number of steps per segment if the brute force method
was used. As discussed earlier, this tracks the segment width. Note that the number of
steps per segment in the $\sin(\pi x)$ function is monotone decreasing. This is because the
segment width decreases with increasing x, as $\sin(\pi x)$ becomes increasingly nonlinear.
In a similar way, the number of steps in the $\cos(\pi x)$ function is monotone increasing. It
is interesting that the number of steps to compute the segmentation for $\sin(\pi x)$ is much
greater than for $\cos(\pi x)$. For $\varepsilon = 2^{-9}$, $\sin(\pi x)$ requires 20.4% of the total number of steps
required by the brute force method, while $\cos(\pi x)$ requires only 6.5%. Indeed, most of
the additional steps in $\sin(\pi x)$ over $\cos(\pi x)$ occur in the first segment.


Table 4

Number of steps needed to segment sin(πx) and cos(πx) using linear approximation.*

                % Steps of                          Segment Number
Function        Brute Force         1        2        3        4        5        6        7

sin(πx)            20.4%       >16,440   <2,184     <893     <481     <277     <149       >1
# Steps/Seg.                    25,133   15,963   13,617   12,451   11,790   11,424    9,622

cos(πx)             6.5%           >26     >120     >238     >417     >754   >1,677   >3,265
# Steps/Seg.                    11,280   11,463   11,865   12,582   13,858   16,526   22,426

*For 8 bit precision, ε = 2^{-9} and N = 100,000. The function is approximated by a piecewise linear
polynomial using a segment width estimate from the beginning of the segment.


The > and < show the direction the algorithm had to go to achieve the optimum
segmentation. For example, the entry >16,440 in the column Segment Number = 1 and
the row sin(πx) means the estimate was short for the leftmost segment in the
segmentation of sin(πx), and it was necessary to increase x to achieve an optimum segment. It
used 16,440 steps in the algorithm.
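
The percentages in Table 4 follow directly from the per-segment entries, since the brute force method takes one step per point (N steps in total). The short check below is only a sketch of that arithmetic; the helper names are hypothetical, and reading "<" as "the estimate was long" is inferred from the description of ">" above.

```python
def parse_entry(entry):
    """Split a table entry such as '>16,440' into (direction, steps)."""
    direction = entry[0]  # '>': estimate was short; '<': estimate was long (inferred)
    steps = int(entry[1:].replace(",", ""))
    return direction, steps

def percent_of_brute_force(row, n_points):
    """Total algorithm steps as a percentage of the n_points brute force steps."""
    total = sum(parse_entry(e)[1] for e in row)
    return 100.0 * total / n_points

N = 100_000
sin_row = [">16,440", "<2,184", "<893", "<481", "<277", "<149", ">1"]
cos_row = [">26", ">120", ">238", ">417", ">754", ">1,677", ">3,265"]

print(f"sin: {percent_of_brute_force(sin_row, N):.1f}%")
print(f"cos: {percent_of_brute_force(cos_row, N):.1f}%")
```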

Both sin(πx) and cos(πx) require the same number of segments, as would be expected.
However, there is a significant difference in the number of steps. This is because the
algorithm begins at lower values of x, where the second derivative of sin(πx) is 0 and
the second derivative of cos(πx) is −π² (a linear approximation is used). When the second
derivative is 0, the estimate of segment width associated with this is infinite. In the
algorithm, a large segment width is substituted whose value is computed from the smallest
non-zero value of the second derivative (over the whole interval of approximation). From
the data, it is clear that this estimate is far away from the actual segment width.
Consequently, many steps are needed to achieve the optimum segment width. In subsequent
segments, the estimate is more accurate and significantly fewer steps are needed. In the
case of cos(πx), the region of small second derivative is reached only after four or five
segments have been established. For the last segment of sin(πx), the data in Table 4
indicates that the estimated width is longer than needed, in which case a segment width
that takes the segmentation to the largest x value is taken, and only one step is needed.
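
The substitution used when the second derivative vanishes can be sketched as below. This is an illustration only: it assumes the same textbook width formula w = 4·sqrt(ε/|f''|) for a minimax linear fit, takes "the whole interval of approximation" to be a user-supplied grid of sample points, and uses the hypothetical names `fallback_width` and `safe_width_estimate`.

```python
import math

def fallback_width(d2f, grid, eps):
    """Width computed from the smallest non-zero |f''| over the sampled interval."""
    smallest = min(abs(d2f(x)) for x in grid if d2f(x) != 0.0)
    return 4.0 * math.sqrt(eps / smallest)

def safe_width_estimate(d2f, x, eps, grid):
    """Local width estimate; where f''(x) is 0 the estimate would be infinite,
    so a large finite width is substituted instead."""
    curvature = abs(d2f(x))
    if curvature == 0.0:
        return fallback_width(d2f, grid, eps)
    return 4.0 * math.sqrt(eps / curvature)

# Example: sin(pi x) sampled on [0, 1/2]; its second derivative is 0 at x = 0.
d2sin = lambda x: -(math.pi ** 2) * math.sin(math.pi * x)
grid = [0.5 * i / 1000 for i in range(1001)]
print(safe_width_estimate(d2sin, 0.0, 2.0 ** -9, grid))
print(safe_width_estimate(d2sin, 0.25, 2.0 ** -9, grid))
```

The substituted value is finite but bears little relation to the true width of the first segment, which is why the search in that segment needs many steps.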

Table 4 shows the total number of points in each segment. For example, for Segment 1
of sin(πx), 25,133 points occur, which means the brute force method would require 25,133
steps. Because of the estimate, however, only 16,440 steps are needed to establish the
segment. Included in this number are 2 steps, one to cross right of the end point, where
it is found that ε has been exceeded, and one to cross back.

Note that the estimate for the width of Segment 1 of cos(πx) is much better. Only 26
steps are needed to find the segment end point. It is tempting to believe that the number
of steps associated with Segment i of sin(πx) and Segment 8 − i of cos(πx) should be the
same, for 1 ≤ i ≤ 7. This is not true because Segment 7 for either sin(πx) or cos(πx) is not
a full segment, while all other segments are full segments. That is, the algorithm moves
from left to right, and does not need a full segment for Segment 7.

We note that N = 100,000 is much larger than one would normally use when ε = 2^{-9}.
We have chosen an example with a small number of segments, 7, for clarity's sake and a
number of points more representative of a practical segmentation. Also, the function was
carefully chosen to illustrate how the number of steps in the algorithm depends on where
the function's second derivative is 0.

Table 5 shows the detail of a segmentation for sin(πx) and cos(πx) using quadratic
approximations. Here, 16 bit precision is achieved with a given approximation error
of 2^{-17}. In the case of quadratic approximation, the same phenomenon seen in linear
approximation exists, but to a lesser extent. In quadratic approximation, it is a third
derivative of 0 that makes the estimate inaccurate, rather than a second derivative of 0
as in linear approximation. The roles of sin(πx) and cos(πx) reverse, as the third derivative
of cos(πx) is 0 at x = 0. Here, more steps are required for cos(πx) than for sin(πx),
but not many more.
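
For reference, the derivatives behind this role reversal are written out below. They are standard calculus rather than anything specific to the algorithm: at x = 0 the second derivative of sin(πx) vanishes (the linear case), while the third derivative of cos(πx) vanishes (the quadratic case).

```latex
\begin{align*}
\frac{d^2}{dx^2}\sin(\pi x) &= -\pi^2 \sin(\pi x), &
\frac{d^3}{dx^3}\sin(\pi x) &= -\pi^3 \cos(\pi x), \\
\frac{d^2}{dx^2}\cos(\pi x) &= -\pi^2 \cos(\pi x), &
\frac{d^3}{dx^3}\cos(\pi x) &= \pi^3 \sin(\pi x).
\end{align*}
```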


Table 5

Number of steps needed to segment sin(πx) and cos(πx) using quadratic approximation.*

               % Steps of                                   Segment Number
Function       Brute Force     1     2     3     4     5     6     7     8     9    10    11    12

sin(πx)         0.2242%       >6    <8   <10   <12   <12   <14   >14   <16   >16   >18   <20    >1
# Steps/Seg.                4741  4761  4804  4871  4967  5099  5282  5538  5916  6538  7893  5114

cos(πx)         0.2486%      >22   >20   >18   >16   >16   >14   >14   >12   >12   >10    >8    >1
# Steps/Seg.                9654  6997  6157  5691  5388  5176  5022  4911  4831  4778  4747  2172

*For 17 bit precision, ε = 2^{-17} and N = 65,535. The function is approximated by a piecewise quadratic
polynomial using a segment width estimate from the beginning of the segment. The Remez Algorithm
was applied once.


6.2.  The  Benefit  of  LOCATE  and  PINPOINT 

Table 6 compares the functions sin(πx) and cos(πx) with respect to the number of
steps required in each segment. For both functions, the first row shows the number of
steps required by LOCATE to compute each segment. The second row shows the number
of steps required by PINPOINT. The third row shows the total of these two. The fourth
row shows the total number of points in each segment, out of a total of N = 5,000,000.
We have used many more points to represent the interval of approximation than would
normally be used when ε = 2^{-9} to illustrate the case of high precision with a tractable
number of segments. The total number of points in each segment is also the number of
steps the brute force method would take to compute that segment. It is noteworthy that
Algorithm 1 requires only 0.0028% and 0.0025% of the steps needed by the brute force
method for the sin(πx) and cos(πx) functions, respectively. This also shows the merit
of estimates. That is, the use of an estimate reduces the number of steps by roughly
one-half (from 0.0052% without an estimate).
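
The division of labor between the two phases can be sketched as below. The definitions of LOCATE and PINPOINT are not restated in this section, so this is an assumption: `locate` brackets the segment end by doubling its stride away from a starting guess, and `pinpoint` bisects the bracket. The predicate `fits(k)`, meaning "a segment ending at point k meets the error bound", and all function names are hypothetical.

```python
def locate(fits, guess, limit):
    """Bracket the segment end by doubling the stride away from `guess`.

    Returns (lo, hi, steps) with fits(lo) True and fits(hi) False; hi may be
    limit + 1, which is treated as not fitting. Assumes fits(0) is True and
    fits is monotone (True up to the segment end, False after it)."""
    steps = 1                      # the evaluation of fits(guess) below
    stride = 1
    if fits(guess):                # guess was short: move right
        lo = guess
        while True:
            hi = lo + stride
            if hi > limit:
                return lo, limit + 1, steps
            steps += 1
            if not fits(hi):
                return lo, hi, steps
            lo, stride = hi, stride * 2
    hi = guess                     # guess was long: move left
    while True:
        lo = max(hi - stride, 0)
        if lo == 0:
            return 0, hi, steps
        steps += 1
        if fits(lo):
            return lo, hi, steps
        hi, stride = lo, stride * 2

def pinpoint(fits, lo, hi):
    """Bisect the bracket (lo, hi) down to the last point that fits."""
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if fits(mid):
            lo = mid
        else:
            hi = mid
    return lo, steps

# Toy predicate: the segment may extend to point 1,256,637 and no further.
true_end = 1_256_637
fits = lambda k: k <= true_end
lo, hi, locate_steps = locate(fits, 0, 5_000_000)   # guess 0 models "no estimate"
end, pinpoint_steps = pinpoint(fits, lo, hi)
print(end, locate_steps, pinpoint_steps)            # prints: 1256637 22 20
```

With no estimate (a guess of 0) and a first segment of 1,256,637 points, this sketch takes 22 and 20 steps, which happens to agree with the No Estimate entries for Segment 1 of sin(πx) in Table 6. A guess close to the true end shortens both phases, since the bracket found by doubling is narrower.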

Note that when an estimate is used, the sin(πx) function requires more steps than the
cos(πx) function. This is due to the inaccuracy of the estimate when the second derivative
is 0, which occurs in the first (leftmost) segment of sin(πx) and in the last (rightmost)
segment of cos(πx). Because of this, the first segment of the sin(πx) function requires
many more steps, 34, than any other segment. Since the same point occurs in the last
segment of cos(πx), which is a truncated segment, the inaccuracy has much less effect
on the number of steps needed.¹ It follows that the number of steps required by the
algorithm depends on where the second derivative is 0.


Table 6

Number of steps needed to segment sin(πx) and cos(πx) using estimates from three
points.*

                 % Steps of                                 Segment Number
Operation        Brute Force          1         2         3         4         5         6           7

No Estimate, sin(πx)
LOCATE                              22>       21>       21>       21>       21>       21>         20>
PINPOINT                            20>       19>       19>       19>       19>       19>          0>
TOTAL             0.0052%           42>       40>       40>       40>       40>       40>         20>
# Steps/Seg.                  1,256,637   798,153   680,848   622,519   589,489   571,189     481,165

No Estimate, cos(πx)
LOCATE                              21>       21>       21>       21>       21>       21>         22>
PINPOINT                            19>       19>       19>       19>       19>       19>          0>
TOTAL             0.0052%           40>       40>       40>       40>       40>       40>         22>
# Steps/Seg.                    563,989   573,146   593,248   629,071   692,868   826,274   1,121,404

Three Estimates, sin(πx)
LOCATE                              18<       13<       12<       11<       11<       11<          1>
PINPOINT                            16<       11<       10<        9<        9<        9<          0>
TOTAL             0.0028%           34<       24<       22<       20<       20<       20<          1>
# Steps/Seg.                  1,256,637   798,153   680,848   622,519   589,489   571,189     481,165

Three Estimates, cos(πx)
LOCATE                              11<       11<       11<       11<       12<       12<          1>
PINPOINT                             9<        9<        9<        9<       10<       10<          0>
TOTAL             0.0025%           20<       20<       20<       20<       22<       22<          1>
# Steps/Seg.                    563,989   573,146   593,248   629,071   692,868   826,274   1,121,404

*For ε = 2^{-9} and N = 5,000,000, and estimates from three points.


There is a nearly linear relationship between the number of steps and the number of
segments. This can be seen in the scatter plot of Fig. 3,³ which shows the percentage of
steps compared to brute force versus the number of segments. A reason for this is the fact
that the estimates for 16-bit precision are close enough that most functions require the
fewest steps per segment, which is 2. For example, one function requires 50 segments and
0.1508% of the steps required by brute force. Its total number of steps is 99
(= 2 × 49 + 1), which is the minimum. However, the Gaussian function
(1/√(2π)) e^{−x²/2} requires 53 segments and 0.2950% of the steps required by brute force,
slightly less than twice that of the 50-segment function. This is because of extra steps
needed near x = 1, where the function is close to linear, and the estimates are less accurate.

Figure 3. Percent of the steps by brute force versus the number of segments


7.  Concluding  Remarks 

We have given a segmentation algorithm that efficiently segments a given numeric
function, such as sin(πx), in such a way that the polynomial approximation error is less than
some given value. The algorithm requires many fewer steps than a previous algorithm [6].
Experimental results show that, in some instances, only the absolute minimum number
of steps is needed. In most instances, it requires close to the minimum number of steps.


³We chose 23 bit precision instead of 24 bit precision because the fraction in the 32-bit single precision
IEEE Floating Point Standard (ANSI/IEEE Std 754-1985) is 23 bits.